1.  INTRODUCTION


Corresponding to any time series $\{X_i;\ i = 0,1,2,\ldots\}$ is the time series
$\{\bar{X}_j;\ j = 0,1,2,\ldots\}$, where

$\bar{X}_j = b^{-1} \sum_{k=1}^{b} X_{(j-1)b+k}$    (1)

is the $j$th batch mean of size $b$. The batch-means process, or aggregated time series, is
of  interest  when  observations  are  actually  batched  (see,  e.g.,  Telser  [1967],  Amemiya 
and  Wu  [1972],  and  Tiao  [1972]),  when  calculations  need  to  be  simplified  (see,  e.g., 
Blackman  and  Tukey  [1958,  Sec.  B.  17]),  or  when  the  process  mean,  E(X),  is  to  be 
estimated.  The  third  context  motivates  our  work. 


Consider estimating $E(X)$ with the average of $n$ observations,
$\bar{\bar{X}} = n^{-1} \sum_{i=1}^{n} X_i$. Using batch means to estimate the variance of
the sample mean, $V(\bar{\bar{X}})$, has long been considered (see, e.g., Conway [1963]).
Brillinger (1973) shows that if the values of a process at a distance from each other are
only weakly dependent, then the batch means are asymptotically independent and normally
distributed. Thus, for large batch size $b$, $V(\bar{\bar{X}})$ can be estimated using
$S^2 / k$, where $k = \lfloor n/b \rfloor$,
$S^2 = (k-1)^{-1} \sum_{j=1}^{k} (\bar{X}_j - \bar{\bar{X}})^2$, and $\lfloor \cdot \rfloor$
denotes the floor function. (Moran [1965] discusses related estimators.) A nominal
$100(1-\alpha)$ percent confidence interval on $E(X)$ is then
$\bar{\bar{X}} \pm t_{1-\alpha/2,\,k-1}\, S\, k^{-1/2}$, where $t_{1-\alpha/2,\,k-1}$ is the
$1 - (\alpha/2)$ quantile of Student's t distribution with $k-1$ degrees of freedom.
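The interval just described can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function names `batch_means` and `batch_means_interval` are hypothetical, and the t quantile is passed in by the caller because the Python standard library has no Student's t table.

```python
import math


def batch_means(x, b):
    """Return the k = floor(n / b) non-overlapping batch means of size b."""
    k = len(x) // b
    return [sum(x[j * b:(j + 1) * b]) / b for j in range(k)]


def batch_means_interval(x, b, t_quantile):
    """Nominal confidence interval for E(X) based on batch means.

    t_quantile is the 1 - alpha/2 quantile of Student's t distribution with
    k - 1 degrees of freedom, supplied by the caller (an assumption of this
    sketch).
    """
    means = batch_means(x, b)
    k = len(means)
    if k < 2:
        raise ValueError("need at least two batches")
    grand = sum(x) / len(x)  # average of all n observations
    s2 = sum((m - grand) ** 2 for m in means) / (k - 1)
    half_width = t_quantile * math.sqrt(s2 / k)
    return grand - half_width, grand + half_width


# Eight observations in k = 4 batches of size b = 2; 3.182 is approximately
# the 0.975 quantile of Student's t with 3 degrees of freedom.
print(batch_means_interval([1, 2, 3, 4, 5, 6, 7, 8], 2, 3.182))
```

With so few batches the interval is wide, which is the price of the small degrees of freedom; the point of the sketch is only the arithmetic of the estimator.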


The batch means algorithms developed by Mechanic and McKay (1966), Fishman
(1978), Law and Carson (1979), Schriber and Andrews (1979), and Adam (1983)
empirically calculate measures of batch dependency for various batch sizes $b$ in an
attempt to determine a reasonably small value of $b$ that yields batch means that are
almost independent and normally distributed. These procedures require substantial
calculation; Law and Carson (1979), for example, calculate first-order correlations based
on 400 batches. That so many batches are required for accurate estimation of dependency
measures is unfortunate, since Schmeiser (1982) shows that, for fixed $n$, additional
batches beyond some small number (ten to thirty) do little to improve the statistical
properties of the batch means confidence interval procedures. The results of Section 2
are motivated by the idea that knowledge of the relationship between $\{X_i\}$ and
$\{\bar{X}_j\}$ can be used to measure properties of $\{\bar{X}_j\}$ even for small values
of $k$.

A second reason for studying the relationship between $\{X_i\}$ and $\{\bar{X}_j\}$ is to
allow more efficient simulation studies of batch-means procedures. Studying the
performance of several batch-means procedures in the context of various distributional
assumptions for $\{X_i\}$ requires a large computational effort, especially when the large
sample sizes required to simulate a system and the large number of replications required
for meaningful conclusions are considered. A crude Monte Carlo method is to generate
$X_1, X_2, \ldots, X_n$ and calculate the batch means
$\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_{\lfloor n/b \rfloor}$ for all values of $b$ of
interest. A computationally more efficient alternative is to derive the properties of
$\{\bar{X}_j\}$ from the properties of $\{X_i\}$ and to generate directly the batch means
$\{\bar{X}_j\}$, as discussed in Section 4.


A third motivation is that direct insights might result from studying properties of
$\{\bar{X}_j\}$ as functions of the parameters of $\{X_i\}$. Particularly interesting is the
sensitivity of $\{\bar{X}_j\}$ to the underlying process and to the batch size $b$, as
discussed in Kang (1984, Chapter 5).

Section  2  contains  results  relating  batch-means  processes  to  arbitrary  stationary 
autoregressive  moving-average  (ARMA)  processes.  Section  3  considers  the  special  case 
of  the  underlying  process  being  ARMA  (1,1).  Section  4  is  a  summary  containing  an 
algorithm  for  determining  the  batch-means  process  from  the  underlying  process. 

2.  BATCH  MEANS  OF  STATIONARY  ARMA  PROCESSES 
The ARMA($p$,$q$) process by definition satisfies

$\sum_{k=0}^{p} \phi_k X_{i-k} = \sum_{k=0}^{q} \theta_k \epsilon_{i-k}$,    (2)

where $\phi_0 = 1$, $\theta_0 = 1$, and the error terms $\epsilon_i$ are independent with
zero mean and variance $\sigma_\epsilon^2$. The main result of this paper is Theorem 1,
which states that batch-means processes arising from ARMA underlying processes are
themselves ARMA and specifies the parameter values.

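Theorem 1. Consider the stationary ARMA($p$,$q$) process of equation (2).
Then $\{\bar{X}_j\}$ is the stationary ARMA($p$,$\bar{q}$) process

$\sum_{k=0}^{p} \bar{\phi}_k \bar{X}_{j-k} = \sum_{k=0}^{\bar{q}} \bar{\theta}_k \bar{\epsilon}_{j-k}$,

where $\bar{\phi}_0 = 1$, $\bar{\theta}_0 = 1$, the batch-means error terms
$\bar{\epsilon}_j$ are independent random variables with zero mean and variance
$\sigma_{\bar{\epsilon}}^2$, and $\bar{q}$, $\bar{\phi}_1, \bar{\phi}_2, \ldots, \bar{\phi}_p$,
and $\bar{\theta}_1, \bar{\theta}_2, \ldots, \bar{\theta}_{\bar{q}}$ are functions (of the
parameters of the underlying process and the batch size $b$) given in Lemmas 1, 2, and 4,
respectively.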

The  proof  of  Theorem  1  requires  the  following  lemmas. 

Lemma 1 (Anderson [1979a, p. 155]). If the underlying process $\{X_i\}$ is a
stationary ARMA($p$,$q$) process, then the batch-means process $\{\bar{X}_j\}$ is a
stationary ARMA($p$,$\bar{q}$) process, where $\bar{q} = p - \lfloor (p-q)/b \rfloor$.

Anderson  uses  the  more  complicated,  but  equivalent,  expression  q  = 

Lemma  1  has  several  direct  implications,  as  discussed  in  the  Appendix. 
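As a quick check of Lemma 1, the batch-means MA order can be computed directly, taking the order to be $p - \lfloor (p-q)/b \rfloor$ with a true floor. The function name `batch_ma_order` is hypothetical; the only subtlety in this sketch is that the floor must round toward minus infinity when $p < q$.

```python
def batch_ma_order(p, q, b):
    """MA order of the batch-means process: p - floor((p - q) / b)."""
    if p < 0 or q < 0 or b < 1:
        raise ValueError("need p >= 0, q >= 0, b >= 1")
    # Python's // rounds toward minus infinity, so it is a true floor even
    # when p < q and (p - q) / b is negative.
    return p - (p - q) // b


# AR(2) batched in pairs, MA(3) batched in pairs, ARMA(1,4) with a large batch.
for p, q, b in [(2, 0, 2), (0, 3, 2), (1, 4, 10)]:
    print((p, q, b), batch_ma_order(p, q, b))
```

A language whose integer division truncates toward zero would give the wrong order for the $p < q$ cases, which is why the floor is called out in the comment.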

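Lemma 2 (Amemiya and Wu [1972]). The AR parameters
$\bar{\phi}_1, \bar{\phi}_2, \ldots, \bar{\phi}_p$ of the batch-means process
$\{\bar{X}_j\}$ are the coefficients of $B, B^2, \ldots, B^p$ of

$\prod_{k=1}^{p} (1 - \alpha_k^b B)$,

respectively, where $\alpha_1, \alpha_2, \ldots, \alpha_p$ are the roots of the
characteristic equation $\sum_{k=0}^{p} \phi_k W^{p-k} = 0$.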
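A small sketch of Lemma 2, under the reading that the batch-means AR polynomial is $\prod_{k=1}^{p} (1 - \alpha_k^b B)$, where the $\alpha_k$ are the roots of the characteristic equation of the underlying process. The function name `batch_ar_params` is hypothetical, and the roots are assumed to be supplied by the caller (for an AR(1) term $X_i + \phi X_{i-1}$ the single root is $-\phi$).

```python
def batch_ar_params(alphas, b):
    """Coefficients of B, B^2, ..., B^p in prod_k (1 - alpha_k**b * B)."""
    coeffs = [1 + 0j]  # polynomial in B, constant term first
    for alpha in alphas:
        root_b = complex(alpha) ** b
        nxt = coeffs + [0j]
        for i, c in enumerate(coeffs):
            nxt[i + 1] -= root_b * c  # multiply by (1 - root_b * B)
        coeffs = nxt
    # Complex roots of a real polynomial come in conjugate pairs, so the
    # imaginary parts cancel and only the real parts are returned.
    return [c.real for c in coeffs[1:]]


# AR(1) with phi = 0.5 has root -0.5; with b = 2 the batch-means AR
# parameter is -(-0.5)**2.
print(batch_ar_params([-0.5], 2))
```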


Lemma 3. For any stationary process $\{X_i\}$, the lag-$h$ autocorrelation of
the batch-means process $\{\bar{X}_j\}$ is

$\bar{\rho}_h = \mathrm{Corr}(\bar{X}_j, \bar{X}_{j+h}) = \left[ \sum_{i=1}^{b} i\, \rho_{(h-1)b+i} + \sum_{i=1}^{b-1} (b-i)\, \rho_{hb+i} \right] / (bc)$,

where $c = 1 + 2 \sum_{h=1}^{b-1} (1 - (h/b))\, \rho_h$, $\rho_h = R_h / R_0$, and
$R_h = E[(X_i - E X)(X_{i+h} - E X)]$ for $h = 0,1,2,\ldots$ and $i = 0,1,2,\ldots$.

Proof. For any stationary process, the lag-$h$ covariance of the batch
means process is

$\mathrm{Cov}(\bar{X}_j, \bar{X}_{j+h}) = b^{-2} \sum_{i=1}^{b} \sum_{l=1}^{b} \mathrm{Cov}(X_{(j-1)b+i}, X_{(j+h-1)b+l})$    (3)

(see, e.g., Kleijnen [1975, p. 507]), which is a special case of the covariance
of linear combinations of random variables as discussed in Box and Jenkins
(1976, pp. 28-29). Also, for any stationary process, each batch mean $\bar{X}_j$
has variance

$\bar{R}_0 = V(\bar{X}_j) = c R_0 / b$    (4)

(see, e.g., Fishman [1973, p. 281]). The definition of correlation

$\bar{\rho}_h = \mathrm{Cov}(\bar{X}_j, \bar{X}_{j+h}) / [V(\bar{X}_j) V(\bar{X}_{j+h})]^{1/2}$

and equations (3) and (4) yield

$\bar{\rho}_h = b^{-2} \sum_{i=1}^{b} \sum_{l=1}^{b} \mathrm{Cov}(X_{(j-1)b+i}, X_{(j+h-1)b+l}) / [c R_0 / b]$

$= \sum_{i=1}^{b} \sum_{l=1}^{b} \mathrm{Corr}(X_{(j-1)b+i}, X_{(j+h-1)b+l}) / (bc)$.

Counting like terms arising from stationarity yields the result. □
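The counting argument can be checked numerically. The sketch below assumes the collected form of Lemma 3, in which lag $(h-1)b+i$ appears $i$ times for $i = 1, \ldots, b$ and lag $hb+i$ appears $b-i$ times for $i = 1, \ldots, b-1$, and compares it with the uncollected $b \times b$ double sum. Both function names are hypothetical, and `rho` is any caller-supplied autocorrelation function of the underlying process.

```python
def batch_autocorr(rho, b, h):
    """Lag-h autocorrelation of batch means of size b, for h >= 1.

    rho(m) must return the lag-m autocorrelation of the underlying
    process for m >= 1.
    """
    c = 1 + 2 * sum((1 - m / b) * rho(m) for m in range(1, b))
    near = sum(i * rho((h - 1) * b + i) for i in range(1, b + 1))
    far = sum((b - i) * rho(h * b + i) for i in range(1, b))
    return (near + far) / (b * c)


def batch_autocorr_bruteforce(rho, b, h):
    """Same quantity from the b-by-b double sum, without collecting terms."""
    def r(m):
        return 1.0 if m == 0 else rho(abs(m))

    idx = range(1, b + 1)
    num = sum(r(h * b + l - i) for i in idx for l in idx)
    den = sum(r(l - i) for i in idx for l in idx)
    return num / den


def ar1_rho(m):
    """Illustrative underlying process: AR(1)-type decay rho_m = 0.6**m."""
    return 0.6 ** m


print(batch_autocorr(ar1_rho, 5, 1), batch_autocorr_bruteforce(ar1_rho, 5, 1))
```

The collected form needs $O(b)$ evaluations of `rho` per lag instead of $O(b^2)$, which matters when many batch sizes are examined.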

Lemma 4 (Anderson [1971, p. 237]). Consider a stationary ARMA($p$,$q$)
process with known AR (autoregressive) parameters $\phi_1, \phi_2, \ldots, \phi_p$,
variance $R_0$, and autocorrelation coefficients $\rho_1, \rho_2, \ldots, \rho_q$. Then
the MA (moving-average) parameters $\theta_1, \theta_2, \ldots, \theta_q$ are determined.

Theorem  1  can  now  be  proven  using  the  four  lemmas. 

Proof  of  Theorem  1. 

1. From Lemma 1, the AR and MA orders of $\{\bar{X}_j\}$ are determined; in
particular, $\{\bar{X}_j\}$ is an ARMA($p$,$\bar{q}$) process.

2. From Lemma 2, the AR parameters of $\{\bar{X}_j\}$ are determined.

3. The autocorrelations $\rho_1, \rho_2, \ldots$ of a stationary ARMA process
can be calculated using the algorithm of Sweet and Mazaheri (1979).

4. Given the autocorrelations from Step 3, the batch-means variance $\bar{R}_0$
is determined by equation (4) and the batch-means autocorrelations
$\bar{\rho}_1, \bar{\rho}_2, \ldots$ are determined by Lemma 3.

5. The MA parameters $\bar{\theta}_1, \bar{\theta}_2, \ldots, \bar{\theta}_{\bar{q}}$ are
then determined from Lemma 4. □

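The representation of the batch-means process is not unique. The batch-means
MA parameters of Lemma 4 require the solution of a polynomial equation of order
$2\bar{q}$ to determine $\bar{\theta}_1, \bar{\theta}_2, \ldots, \bar{\theta}_{\bar{q}}$.
The $2\bar{q}$ roots can be partitioned $2^{\bar{q}}$ ways into two sets
$(z_1, z_2, \ldots, z_{\bar{q}})$ and $(z_{\bar{q}+1}, z_{\bar{q}+2}, \ldots, z_{2\bar{q}})$
such that $z_{\bar{q}+i} = 1/z_i$. All such subsets of size $\bar{q}$ from
$(z_1, z_2, \ldots, z_{2\bar{q}})$ determine
$\bar{\theta}_1, \bar{\theta}_2, \ldots, \bar{\theta}_{\bar{q}}$ corresponding to
stochastically equivalent processes. But there is a unique subset of size $\bar{q}$ having
$|z_i| \le 1$, and therefore $|z_{\bar{q}+i}| = |1/z_i| \ge 1$, for
$i = 1, 2, \ldots, \bar{q}$. Thus for the ARMA($p$,$\bar{q}$) process there exist
$2^{\bar{q}} - 1$ non-invertible processes corresponding to a unique invertible process.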
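The simplest instance of this non-uniqueness is a first-order MA term, where the two representations use $\theta$ and $1/\theta$. The sketch below is an illustration rather than part of the paper's procedure: for the MA(1) process $X_i = \epsilon_i + \theta \epsilon_{i-1}$ it checks that replacing $(\theta, \sigma_\epsilon^2)$ by $(1/\theta, \theta^2 \sigma_\epsilon^2)$ leaves the variance and the lag-one autocorrelation unchanged. The function names are hypothetical.

```python
def ma1_moments(theta, sigma2):
    """Variance and lag-one autocorrelation of X_i = e_i + theta * e_{i-1}."""
    r0 = sigma2 * (1 + theta ** 2)
    rho1 = theta / (1 + theta ** 2)
    return r0, rho1


def reciprocal_representation(theta, sigma2):
    """The other MA(1) parameter pair with the same second-order moments."""
    return 1 / theta, sigma2 * theta ** 2


invertible = (0.4, 2.0)
non_invertible = reciprocal_representation(*invertible)
print(ma1_moments(*invertible))
print(ma1_moments(*non_invertible))
```

Only the pair with $|\theta| \le 1$ is invertible, which is why a procedure that must return a single answer normally reports that one.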


3.  ARMA(1,1)  BATCH  MEANS 


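Now consider the special case of stationary first-order ARMA processes

$X_i + \phi X_{i-1} = \epsilon_i + \theta \epsilon_{i-1}$ for $i = 1,2,\ldots$.    (5)

The low-order moments of $\{X_i\}$ are the zero mean, variance

$R_0 = \sigma_\epsilon^2 (1 + \theta^2 - 2\phi\theta) / (1 - \phi^2)$,    (6)

the lag-one autocorrelation

$\rho_1 = (1 - \phi\theta)(\theta - \phi) / (1 + \theta^2 - 2\phi\theta)$,    (7)

and lag-$h$ autocorrelations

$\rho_h = (-\phi)\rho_{h-1} = (-\phi)^{h-1} \rho_1$ for $h = 2,3,\ldots$.    (8)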

Closed-form  expressions  for  the  parameters  of  the  batch-means  process  are  given  in 
Theorem  2. 

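Theorem 2. Consider the stationary ARMA(1,1) process of equation (5).
The corresponding batch-means process is the stationary ARMA(1,1) process

$\bar{X}_j + \bar{\phi} \bar{X}_{j-1} = \bar{\epsilon}_j + \bar{\theta} \bar{\epsilon}_{j-1}$ for $j = 1,2,\ldots$,    (9)

where

$\bar{\phi} = -(-\phi)^b$,    (10)

$\bar{\theta} = \{ 1 + 2\bar{\phi}\bar{\rho}_1 + \bar{\phi}^2 \pm [(1 - \bar{\phi}^2)(1 - (2\bar{\rho}_1 + \bar{\phi})^2)]^{1/2} \} / [2(\bar{\phi} + \bar{\rho}_1)]$,    (11)

$\sigma_{\bar{\epsilon}}^2 = \bar{R}_0 (1 - \bar{\phi}^2) / (\bar{\theta}^2 - 2\bar{\phi}\bar{\theta} + 1)$,    (12)

where

$\bar{R}_0 = c R_0 / b$,    (13)

$\bar{\rho}_1 = \rho_1 [1 - (-\phi)^b]^2 / [b c (1+\phi)^2]$,    (14)

and

$c = 1 + \{ 2\rho_1 [(-\phi)^b + b(1+\phi) - 1] / [b (1+\phi)^2] \}$.    (15)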

Proof  of  Theorem  2. 

Equations (13), (14), and (15) follow from Lemma 3 and equation (4)
via equation (8). Since $\{X_i\}$ is ARMA(1,1), $\{\bar{X}_j\}$ is also ARMA(1,1) by
Lemma 1; that is, equation (9) holds. Equation (10) is a special case of
Lemma 2. Since $\{\bar{X}_j\}$ is ARMA(1,1), equation (7) yields

$\bar{\rho}_1 = (1 - \bar{\phi}\bar{\theta})(\bar{\theta} - \bar{\phi}) / (1 + \bar{\theta}^2 - 2\bar{\phi}\bar{\theta})$.

Solving for $\bar{\theta}$ yields the two roots of equation (11). Since $\{\bar{X}_j\}$ is an
ARMA(1,1) process, equation (6) holds with the batch-means parameters:

$\bar{R}_0 = \sigma_{\bar{\epsilon}}^2 (1 + \bar{\theta}^2 - 2\bar{\phi}\bar{\theta}) / (1 - \bar{\phi}^2)$.

Solving for the variance of the batch-means error term, $\sigma_{\bar{\epsilon}}^2$, with
either value of $\bar{\theta}$ from equation (11) yields equation (12). (But note that the
value of $\sigma_{\bar{\epsilon}}^2$ depends on the choice of $\bar{\theta}$.) □


4.  SUMMARY 

Properties of batch means are studied under the assumption that the underlying
process is ARMA($p$,$q$). For ARMA(1,1) processes, closed-form expressions for the
corresponding batch-means processes are obtained. A numerical procedure is developed
for calculating the parameters of the ARMA batch-means process from the parameters
of the underlying process and the batch size $b$. This procedure is stated concisely here
for convenience.


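ARMA(p,q) Procedure.

Given parameters $\phi_1, \phi_2, \ldots, \phi_p$, $\theta_1, \theta_2, \ldots, \theta_q$,
error variance $\sigma_\epsilon^2$, and batch size $b$, calculate

1. $\bar{q} = p - \lfloor (p-q)/b \rfloor$

2. $\bar{\phi}_1, \bar{\phi}_2, \ldots, \bar{\phi}_p$ using Lemma 2

3. $R_0, \rho_1, \rho_2, \ldots$ from the Yule-Walker equations, probably using
the algorithm of Sweet and Mazaheri (1979)

4. $c = 1 + 2 \sum_{h=1}^{b-1} (1 - (h/b))\, \rho_h$

5. $\bar{R}_0$ from equation (4)

6. $\bar{\rho}_1, \bar{\rho}_2, \ldots$ using Lemma 3

7. $\bar{\theta}_1, \bar{\theta}_2, \ldots, \bar{\theta}_{\bar{q}}$ using Lemma 4

8. $\sigma_{\bar{\epsilon}}^2$ from equation (13) of Anderson (1971, p. 237).

A FORTRAN implementation of the ARMA($p$,$q$) procedure is given in Kang (1984).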

When the underlying process is ARMA(1,1), the following closed-form procedure
can be used:

ARMA(1,1)  Procedure. 

Given parameters $\phi$, $\theta$, error variance $\sigma_\epsilon^2$, and batch size $b$,
calculate

1. $\bar{q} = 1$

2. $\bar{\phi}$ from equation (10)

3. $R_0$ from equation (6), $\rho_1$ from equation (7)

4. $c$ from equation (15)

5. $\bar{R}_0$ from equation (13)

6. $\bar{\rho}_1$ from equation (14)

7. $\bar{\theta}$ from equation (11)

8. $\sigma_{\bar{\epsilon}}^2$ from equation (12)

Notice in the ARMA(1,1) special case that calculation in step 3 of all $2b-1$
autocorrelations of the underlying process is not necessary.
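The eight steps of the ARMA(1,1) procedure can be sketched as follows. This is an illustration under stated assumptions, not the FORTRAN code of Kang (1984): the function name and the returned dictionary keys are hypothetical, the sign convention is $X_i + \phi X_{i-1} = \epsilon_i + \theta \epsilon_{i-1}$, and step 7 returns only the invertible root, written as $2(\bar{\phi} + \bar{\rho}_1)$ divided by the leading term plus the square root. That form is algebraically the smaller root of the quadratic for $\bar{\theta}$ (the two roots multiply to one) and avoids dividing by zero when $\bar{\phi} + \bar{\rho}_1 = 0$.

```python
import math


def arma11_batch_means(phi, theta, sigma2, b):
    """Batch-means parameters for X_i + phi*X_{i-1} = e_i + theta*e_{i-1}."""
    if abs(phi) >= 1 or b < 1:
        raise ValueError("need |phi| < 1 and an integer b >= 1")
    a = -phi                                             # AR root
    phi_bar = -(a ** b)                                  # step 2
    spread = 1 + theta ** 2 - 2 * phi * theta
    r0 = sigma2 * spread / (1 - phi ** 2)                # step 3, variance
    rho1 = (1 - phi * theta) * (theta - phi) / spread    # step 3, lag one
    c = 1 + 2 * rho1 * (a ** b + b * (1 + phi) - 1) / (b * (1 + phi) ** 2)
    r0_bar = c * r0 / b                                  # step 5
    rho1_bar = rho1 * (1 - a ** b) ** 2 / (b * c * (1 + phi) ** 2)  # step 6
    # Step 7: invertible root of the quadratic in theta_bar.
    lead = 1 + 2 * phi_bar * rho1_bar + phi_bar ** 2
    disc = (1 - phi_bar ** 2) * (1 - (2 * rho1_bar + phi_bar) ** 2)
    theta_bar = 2 * (phi_bar + rho1_bar) / (lead + math.sqrt(max(disc, 0.0)))
    # Step 8: error variance of the batch-means process.
    sigma2_bar = r0_bar * (1 - phi_bar ** 2) / (
        1 + theta_bar ** 2 - 2 * phi_bar * theta_bar)
    return {"phi_bar": phi_bar, "theta_bar": theta_bar,
            "sigma2_bar": sigma2_bar, "r0_bar": r0_bar, "rho1_bar": rho1_bar}


# With b = 1 the batch means are the observations themselves, so the
# procedure should hand back the original parameters.
print(arma11_batch_means(0.5, 0.3, 2.0, 1))
```

Choosing the invertible root is a design decision of this sketch; the non-invertible root $1/\bar{\theta}$ describes a stochastically equivalent process with a different error variance.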

If the underlying error terms $\epsilon_i$ are normally distributed, then the batch-means
error terms $\bar{\epsilon}_j$ are also normally distributed (see, e.g., Johnson and Kotz
[1971, p. 51]). Therefore, generation of random variates directly from the batch-means
process is straightforward using equation (9), thereby avoiding the costly computations of
aggregating observations from the underlying process. Initialization for steady-state
results is straightforward for AR and MA processes, but initialization for ARMA is
complicated unless the process is warmed up by discarding some initial observations
(Anderson
APPENDIX


Although Lemma 1 is simple to state compactly, its implications are clearer
when cases are considered individually:


If          and            then

$p > q$     $b < p - q$    $q \le \bar{q} \le p - 1$
            $b = p - q$    $\bar{q} = p - 1$
            $b > p - q$    $\bar{q} = p$

$p = q$     any $b$        $\bar{q} = p$

$p < q$     $b < q - p$    $p + 2 \le \bar{q} \le q$
            $b = q - p$    $\bar{q} = p + 1$
            $b > q - p$    $\bar{q} = p + 1$

Many  results  can  be  stated  immediately  from  examination  of  these  individual 
cases.  Five  such  results  are: 

Result 1. If $\{X_i\}$ is AR($p$), then $\{\bar{X}_j\}$ is ARMA($p$,$\bar{q}$), as studied
by Amemiya and Wu (1972). Additionally, $1 \le \bar{q} \le p$.

Result 2. If $\{X_i\}$ is MA($q$), then $\{\bar{X}_j\}$ is MA($\bar{q}$), where
$1 \le \bar{q} \le q$.

Result 3. If $\{X_i\}$ is AR or ARMA with batch size satisfying $0 \le p - q < b$,
then $\{\bar{X}_j\}$ is ARMA($p$,$p$).

Result 4. If $p \ge q$, then $\lim_{b \to \infty} \bar{q} = p$.

Result 5. If $p < q$, then $\lim_{b \to \infty} \bar{q} = p + 1$.

Of course, considering only the order of the batch-means process can be misleading.
For example, Results 4 and 5 indicate that large batches lead to MA components
of order $p$ or $p+1$; in particular, a batched MA($q$) process converges to an MA(1)
process. But large batches are asymptotically independent. The explanation is that
$\bar{\theta}_1$ is approaching zero as batch size increases. An implication is that, even
for this nicest case of ARMA underlying processes, estimation of the order of the
batch-means process is likely to be difficult.
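The shrinking MA parameter is easy to see numerically in the simplest case. The sketch below assumes an underlying MA(1) process $X_i = \epsilon_i + \theta \epsilon_{i-1}$, for which $\rho_1 = \theta/(1+\theta^2)$ and all higher autocorrelations vanish, so the batch means are MA(1) with $\bar{\rho}_1 = \rho_1/(bc)$ and $c = 1 + 2(1 - 1/b)\rho_1$. The function name is hypothetical and only the invertible root is returned.

```python
import math


def ma1_batch_theta(theta, b):
    """Invertible MA(1) parameter of batch means of size b from an MA(1)."""
    rho1 = theta / (1 + theta ** 2)
    c = 1 + 2 * (1 - 1 / b) * rho1
    rho1_bar = rho1 / (b * c)
    # Invertible root of rho1_bar = t / (1 + t**2), written so that
    # rho1_bar = 0 needs no special case.
    return 2 * rho1_bar / (1 + math.sqrt(1 - 4 * rho1_bar ** 2))


for b in (1, 2, 10, 100, 1000):
    print(b, ma1_batch_theta(0.5, b))
```

The order stays at one for every $b$ while the parameter decays roughly like $1/b$, which illustrates why the order alone says little about the remaining dependence.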